Search results for "DATA MINING"
showing 10 items of 907 documents
A Regional Geography Approach to Understanding the Environmental Changes as a Consequence of the COVID-19 Lockdown in Highly Populated Spanish Cities
2021
Spain has been highly impacted by the COVID-19 pandemic, which is reflected at different scales. From an economic point of view, lockdowns and the reduction of activities have damaged the country (e.g., complete lockdown from March 13 to June 21, 2020). However, it is not clear if the associated environmental impacts could be observed in 2020. Currently, studies on the effects of the lockdown (e.g., decrease in economic activities, transport and social communication) on specific parameters related to climate change, such as air temperature or air pollution, due to a drastic decrease in human activities are rare. They are focused on specific cities and short periods of time. Therefore, the m…
Transferring deep learning models for cloud detection between Landsat-8 and Proba-V
2020
Accurate cloud detection algorithms are mandatory to analyze the large streams of data coming from the different optical Earth observation satellites. Deep learning (DL) based cloud detection schemes provide very accurate cloud detection models. However, training these models for a given sensor requires large datasets of manually labeled samples, which are very costly or even impossible to create when the satellite has not been launched yet. In this work, we present an approach that exploits manually labeled datasets from one satellite to train deep learning models for cloud detection that can be applied (or transferred) to other satellites. We take into account the physical proper…
Hyperspectral dimensionality reduction for biophysical variable statistical retrieval
2017
Current and upcoming airborne and spaceborne imaging spectrometers lead to vast hyperspectral data streams. This scenario calls for automated and optimized spectral dimensionality reduction techniques to enable fast and efficient hyperspectral data processing, such as inferring vegetation properties. In preparation for next-generation biophysical variable retrieval methods applicable to hyperspectral data, we present the evaluation of 11 dimensionality reduction (DR) methods in combination with advanced machine learning regression algorithms (MLRAs) for statistical variable retrieval. Two unique hyperspectral datasets were analyzed on the predictive power of DR + MLRA methods to ret…
Optimizing Gaussian Process Regression for Image Time Series Gap-Filling and Crop Monitoring
2020
Image processing entered the era of artificial intelligence, and machine learning algorithms emerged as attractive alternatives for time series data processing. Satellite image time series processing enables crop phenology monitoring, such as the calculation of start and end of season. Among the promising algorithms, Gaussian process regression (GPR) proved to be a competitive time series gap-filling algorithm with the advantage of, as developed within a Bayesian framework, providing associated uncertainty estimates. Nevertheless, the processing of time series images becomes computationally inefficient in its standard per-pixel usage, mainly for GPR training rather than the fitting step. To…
Geomeasure: GIS and Scripting for Measuring Morphometric Variability
2019
This paper presents Geomeasure, a methodological tool developed to recover typometric information with a twofold objective. First, it speeds up data gathering by automating the way in which the data are recovered. Second, it adds higher accuracy and the possibility of re-measuring archaeological items without further direct interaction with the piece. Based on a combination of R scripting with GIS features, Geomeasure is currently able to automatically gather 125–130 typometric variables per archaeological item, with vectorized photographs as the only input. It can be used as a reliable methodological aid to extract detailed information on patterns and trends of shape variabilit…
X!TandemPipeline: a tool to manage sequence redundancy for protein inference and phosphosite identification
2017
X!TandemPipeline is software designed to perform protein inference and to manage redundancy in the results of phosphosite identification by database search. It provides the minimal list of proteins or phosphosites that are present in a set of samples using grouping algorithms based on the principle of parsimony. Regarding proteins, a two-level classification is performed, where groups gather proteins sharing at least one peptide and subgroups gather proteins that are not distinguishable according to the identified peptides. Regarding phosphosites, an innovative approach based on the concept of phosphoisland is used to gather overlapping phosphopeptides. The graphical interface of X!Tandem…
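The two-level classification described in this abstract can be illustrated with a minimal Python sketch. It is not X!TandemPipeline's code: the function name `group_proteins`, the input layout (protein identifier mapped to a set of identified peptides), and the reading of "sharing at least one peptide" as a transitive relation are all assumptions made for illustration. The parsimony step that reduces the result to a minimal list is not covered.

```python
from collections import defaultdict


def group_proteins(protein_peptides):
    """Return (groups, subgroups) for a dict: protein id -> set of peptides.

    Hypothetical helper. Groups are connected components of the
    "shares at least one peptide" relation (assumed transitive);
    subgroups are proteins with identical identified peptide sets.
    """
    proteins = sorted(protein_peptides)
    parent = {p: p for p in proteins}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # Link every protein to the first protein seen with the same peptide.
    first_owner = {}
    for prot in proteins:
        for pep in protein_peptides[prot]:
            if pep in first_owner:
                union(prot, first_owner[pep])
            else:
                first_owner[pep] = prot

    groups = defaultdict(list)
    for prot in proteins:
        groups[find(prot)].append(prot)

    # Proteins are indistinguishable when their peptide sets are equal.
    subgroups = defaultdict(list)
    for prot in proteins:
        subgroups[frozenset(protein_peptides[prot])].append(prot)

    return sorted(groups.values()), sorted(subgroups.values())


# Made-up example data, not taken from the paper.
example = {
    "P1": {"pepA", "pepB"},
    "P2": {"pepA", "pepB"},
    "P3": {"pepB", "pepC"},
    "P4": {"pepD"},
}
groups, subgroups = group_proteins(example)
```

In this made-up input, P1, P2 and P3 fall into one group because they are linked through shared peptides, while only P1 and P2 form a common subgroup because their peptide sets are identical.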
Model‐based approaches to unconstrained ordination
2014
Unconstrained ordination is commonly used in ecology to visualize multivariate data, in particular, to visualize the main trends between different sites in terms of their species composition or relative abundance. Methods of unconstrained ordination currently used, such as non-metric multidimensional scaling, are algorithm-based techniques developed and implemented without directly accommodating the statistical properties of the data at hand. Failure to account for these key data properties can lead to misleading results. A model-based approach to unconstrained ordination can address this issue, and in this study, two types of models for ordination are proposed based on finite mixtu…
Spatio-Temporal model structures with shared components for semi-continuous species distribution modelling
2017
Understanding the spatio-temporal dynamism and environmental relationships of species is essential for the conservation of natural resources. Many spatio-temporally sampled processes result in continuous positive [0, ∞) abundance datasets that have many zero values observed in areas that lie outside their optimum niche. In such cases the most common option is to use two-part or hurdle models, which fit independent models and consequently independent environmental effects to occurrence and conditional-to-presence abundance. This may be correct in some cases, but not as much in others where the detection probability is related to the abundance. The aim of this work is to infer the…
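The two-part (hurdle) idea mentioned in this abstract can be sketched in Python as follows. This is a covariate-free toy, not the paper's model: the exponential distribution for the positive part and the function names `fit_hurdle`, `hurdle_loglik` and `expected_abundance` are assumptions chosen for illustration. It shows why the two parts are fitted independently: the log-likelihood splits into an occurrence term and a conditional-to-presence term.

```python
import math


def fit_hurdle(abundances):
    """Maximum-likelihood fit of a toy hurdle model.

    Assumes the data contain at least one zero and one positive value.
    Occurrence is Bernoulli(p_presence); positive abundance is assumed
    Exponential(rate). Each parameter is estimated from its own part.
    """
    positives = [y for y in abundances if y > 0]
    p_presence = len(positives) / len(abundances)  # occurrence part
    rate = len(positives) / sum(positives)         # abundance part
    return p_presence, rate


def hurdle_loglik(abundances, p_presence, rate):
    """Log-likelihood: zeros contribute log(1 - p), positives log(p) + log f(y)."""
    total = 0.0
    for y in abundances:
        if y == 0:
            total += math.log(1.0 - p_presence)
        else:
            total += math.log(p_presence) + math.log(rate) - rate * y
    return total


def expected_abundance(p_presence, rate):
    """E[Y] = P(presence) * E[Y | presence], with E[Y | presence] = 1 / rate."""
    return p_presence / rate


# Made-up observations: two absences and two positive abundances.
observations = [0.0, 0.0, 1.0, 3.0]
p_hat, rate_hat = fit_hurdle(observations)
```

Because the occurrence and abundance parameters never appear in the same likelihood term, this kind of model cannot represent a link between detection probability and abundance, which is the limitation the abstract points to.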
Tracking the outbreak. An optimized delimiting survey strategy for Xylella fastidiosa
2020
Current legislation enforces the implementation of intensive surveillance programs for quarantine plant pathogens. After an outbreak, surveys are implemented to delimit the geographic extent of the pathogen and execute disease control. The feasibility of control programs is highly dependent on budget availability, thus it is necessary to target and optimize surveillance strategies. A sequential adaptive delimiting survey involving a three-phase and a two-phase design with increasing spatial resolution was developed and implemented for the Xylella fastidiosa outbreak in Alicante, Spain. Inspection and sampling intensities were optimized using simulation-based methods and results were v…
Rings for Privacy: an Architecture for Large Scale Privacy-Preserving Data Mining
2021
This article proposes a new architecture for privacy-preserving data mining based on Multi Party Computation (MPC) and secure sums. While traditional MPC approaches rely on a small number of aggregation peers replacing a centralized trusted entity, the current study puts forth a distributed solution that involves all data sources in the aggregation process, with the help of a single server for storing intermediate results. A large-scale scenario is examined and the possibility that data become inaccessible during the aggregation process is considered, a possibility that traditional schemes often neglect. Here, it is explicitly examined, as it might be provoked by intermittent network connec…
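As a rough illustration of what a secure sum is, the following Python sketch uses textbook additive secret sharing. It is not the architecture proposed in the article: the modulus, the function names `make_shares` and `secure_sum`, and the all-to-all exchange of shares are assumptions, and the sketch ignores the storage server and the data-unavailability scenario the abstract mentions.

```python
import secrets

# Assumed public modulus; it must exceed any possible true total.
MODULUS = 2**61 - 1


def make_shares(value, n_parties, modulus=MODULUS):
    """Split a private value into n random shares that sum to it mod modulus."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % modulus)
    return shares


def secure_sum(private_values, modulus=MODULUS):
    """Simulate a secure sum among len(private_values) parties.

    shares[i][j] is the share that party i sends to party j. Each party
    publishes only the sum of the shares it received, so no single
    published number reveals an individual input.
    """
    n = len(private_values)
    shares = [make_shares(v, n, modulus) for v in private_values]
    partial_sums = [
        sum(shares[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    return sum(partial_sums) % modulus


# Made-up private inputs held by three data sources.
total = secure_sum([12, 7, 30])
```

The random shares cancel out only when all partial sums are combined, so the result equals the plain sum of the inputs as long as that sum stays below the modulus.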